Optimizing Long Intrinsic Disorder Predictors with Protein Evolutionary Information
Authors
Abstract
Proteins that exist as ensembles of structures, termed intrinsically disordered, have been shown to be responsible for a wide variety of biological functions and to be common in nature. Here we focus on improving sequence-based prediction of long (>30 amino acid residues) regions lacking specific 3-D structure by means of four new neural-network-based Predictors Of Natural Disordered Regions (PONDRs): VL3, VL3H, VL3P, and VL3E. PONDR VL3 used several features from the previously introduced PONDR VL2, but benefited from optimized predictor models and from a slightly larger set of disordered proteins (152 vs. 145) that was cleaned of the mislabeling errors found in the smaller set. PONDR VL3H used homologues of the disordered proteins in the training stage, while PONDR VL3P used attributes derived from sequence profiles obtained by PSI-BLAST searches. Accuracy was measured as the average of the accuracies on disordered and ordered protein regions. By this measure, the 30-fold cross-validation accuracies of VL3, VL3H, and VL3P were, respectively, 83.6 +/- 1.4%, 85.3 +/- 1.4%, and 85.2 +/- 1.5%. PONDR VL3E, obtained by combining VL3H and VL3P, achieved an accuracy of 86.7 +/- 1.4%. This is a significant improvement over our previous PONDRs VLXT (71.6 +/- 1.3%) and VL2 (80.9 +/- 1.4%). The new disorder predictors and the corresponding datasets are freely accessible through the web server at http://www.ist.temple.edu/disprot.
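The accuracy measure described in the abstract, the average of the accuracy on disordered residues and the accuracy on ordered residues, can be illustrated with a minimal Python sketch. The function name `balanced_accuracy`, the 1/0 label encoding, and the pooling of residues into one flat list are assumptions made for illustration; the abstract does not specify how residues were pooled across proteins or folds.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Average of the per-class accuracies.

    Assumed encoding (not from the source): 1 = disordered residue,
    0 = ordered residue. Both sequences are per-residue labels.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    n_disordered = sum(1 for t in y_true if t == 1)
    n_ordered = sum(1 for t in y_true if t == 0)
    if n_disordered == 0 or n_ordered == 0:
        raise ValueError("both classes must be present in y_true")

    # Correctly predicted residues within each class.
    hit_disordered = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    hit_ordered = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    acc_disordered = hit_disordered / n_disordered
    acc_ordered = hit_ordered / n_ordered
    return 0.5 * (acc_disordered + acc_ordered)


# Toy example: 4 disordered residues (3 predicted correctly) and
# 2 ordered residues (1 predicted correctly).
labels = [1, 1, 1, 1, 0, 0]
predictions = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(labels, predictions)
```

In the toy example the disordered-class accuracy is 3/4 and the ordered-class accuracy is 1/2, so the averaged measure is 0.625, whereas the plain fraction of correct residues would be 4/6. Averaging the two class accuracies keeps the score from being dominated by whichever class has more residues.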
Similar resources
A MULTI-OBJECTIVE EVOLUTIONARY ALGORITHM USING DECOMPOSITION (MOEA/D) AND ITS APPLICATION IN MULTIPURPOSE MULTI-RESERVOIR OPERATIONS
This paper presents a Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D) for the optimal operation of a complex multipurpose and multi-reservoir system. Firstly, MOEA/D decomposes a multi-objective optimization problem into a number of scalar optimization sub-problems and optimizes them simultaneously. It uses information of its several neighboring sub-problems for optimizin...
Identifying Similar Patterns of Structural Flexibility in Proteins by Disorder Prediction and Dynamic Programming
Computational methods are prevailing in identifying protein intrinsic disorder. The results from predictors are often given as per-residue disorder scores. The scores describe the disorder propensity of amino acids of a protein and can be further represented as a disorder curve. Many proteins share similar patterns in their disorder curves. The similar patterns are often associated with similar...
Comprehensive large-scale assessment of intrinsic protein disorder
Motivation: Intrinsically disordered regions are key for the function of numerous proteins. Due to the difficulties in experimental disorder characterization, many computational predictors have been developed with various disorder flavors. Their performance is generally measured on small sets mainly from experimentally solved structures, e.g. Protein Data Bank (PDB) chains. MobiDB has only recen...
Content of intrinsic disorder influences the outcome of cell-free protein synthesis
Cell-free protein synthesis is used to produce proteins with various structural traits. Recent bioinformatics analyses indicate that more than half of eukaryotic proteins possess long intrinsically disordered regions. However, no systematic study concerning the connection between intrinsic disorder and expression success of cell-free protein synthesis has been presented until now. To address th...
MobiDB-lite: fast and highly specific consensus prediction of intrinsic disorder in proteins
Motivation: Intrinsic disorder (ID) is established as an important feature of protein sequences. Its use in proteome annotation is however hampered by the availability of many methods with similar performance at the single residue level, which have mostly not been optimized to predict long ID regions of size comparable to domains. Results: Here, we have focused on providing a single consensus-b...
Journal: Journal of Bioinformatics and Computational Biology
Volume 3, Issue 1
Pages: -
Publication date: 2005